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Abstract 



We introduce tools for inference in the multifractal random walk introduced by Bacry et al. (2001 1. These tools 
include formulas for smoothing, filtering and volatility forecasting. In addition, we present methods for computing 
conditional densities for one- and multi-step returns. The inference techniques presented in this paper, including 
maximum likelihood estimation, are applied to data from the Oslo Stock Exchange, and it is observed that the volatility 
forecasts based on the multifractal random walk have a much richer structure than the forecasts obtained from a basic 
stochastic volatility model. 
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1. Introduction 

Modeling financial time series by stochastic processes 
dates back to the work of'Bachelier ( 1900 ). Bachelier pro- 



posed to model the price of a financial asset as a Brownian 
motion with drift. It was later realized, by e.g. |Mitchell| 
( (TgiSj ), that the standard deviation of price changes are 
proportional to the price levels themselves. Therefore, 
Bachelier's model should be modified so that it is the log- 
arithmic asset price, X{t) - log P(t), that is modeled as 
a Brownian motion with drift. As a modification of this 



model, Mandelbrot ( 1963 1 proposed to replace Brown- 
ian motion with a-stable Levy processes with a < 2, so- 
called Levy flights. 

Both Brownian motions and Levy flights are selfsimi- 
lar and have independent increments. However, empirical 
analyses of asset prices have revealed that, even though 
logarithmic returns are uncorrelated, they are nevertheless 
strongly dependent. This stylized fact is called volatility 
clustering, and it is not well described by Brownian mo- 
tions nor Levy flights. Other processes, such as stochas- 
tic volatility (SV) models, are specifically designed to in- 
clude thi s feature. The simplest example is the basic SV 
model of Taylor ( 19821. If we choose[^// - 0, this model 
is defined by the stochastic differential equation 

dX(t) = o-(f) dB(t) , 
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where the logarithmic volatility varies according to an 
Ornstein-Uhlenbeck process, i.e. 



d log cr(f) = -a log o-(t)dt + vdB(t) , 



(1) 



where B(t) is a Brownian motion independent of B{t). 

Another class of SV models are the multifractal ran- 
dom processes. These models come from turbulence the- 



ory, and their origin can be traced back to works of |Kol- 
mogorov ( 1962| l and Obukhov ( 1962 1. The defining prop- 
erties of a multifractal process X(t) are stationary incre- 
ments and structure functions that are power-laws in time, 
i.e. 

E[|^(OI*] ~ f^**^ ■ (2) 

The scaling functions ^(q) are linear for selfsimilar pro- 
cesses, but usually the term "multifractal" refers to the 
cases where ((q) are strictly concave. For such processes, 
the absolute values of the increments of X(t) may have 
algebraically decaying auto-correlation functions (ACFs), 
even though the increments themselves are uncorrelated. 
In contrast, the ACFs for the absolute values of the incre- 
ments decay exponentially in the basic SV models. 

That multifractals represent a suitable framework for 
modeling financial time series was first discovered about 



fifteen years ago by Ghashghaie et al. 


(1996) and 'Man- 


delbrot et al. ( 1997 1. Shortly after fliis 


Calvet and Fisher 



showed how one can obtain a discrete-time SV 
model as a discretization of a continuous-time multifrac- 
tal. 

The model constructed by Calvet and Fisher is called 
the Markov-Switching Multifractal (MSM), and it is con- 



structed by randomizing the so-called multiplicative cas- 
cade. The result is a model that describes log-returns as 



independent, the auto-correlation function for the process 
\x,\ becomes 



Xt — (T -s/M, e, . 



(3) 



E[|.x,Xj|] oc e 



[E[(/i,/2+/!,,/2)2] i Cov(/i„/i,,) 



Here s, ~ 7V(0, 1) are independent variables and the which for 1 <s; 5 « gives the approximate scaling 
volatility is a product on the form 



(6) 



(7) 



The variables M, ^ are updated with different frequencies 
for different levels k. To be precise, at each time step f, 
M,jt is given a new value (independently drawn from a dis- 
tribution M) with probability and left unchanged with 
probability \ -jk- The approximate multifractality in the 
MSM model is achieved by choosing = 1 - (1 - 71)'' 
for some yj e (0, 1) and some > 0. By exploiting gen- 
eral techniques for Markov-Switching models, Calvet and 
Fisher have developed inference methods for the MSM 
model, including maximum likelihood (ML) estimation 
and volatility forecasting. Unfortunately, there are some 
limitations to the applicability of these methods. One 
problem is that the likelihood functions only are avail- 
able when M is discrete, something that leads to rather 
unnatural parameterizations. Also, in practice, it is only 
possible to compute the likelihood if the parameter K does 
not exceed ^ 10 ( |Lux||2008| l. In effect, this introduces an 
unwanted exponential cutoff in the volatility dependence, 
at the time scale . 

At the same time that Calvet and Fisher proposed the 



MSM model, Bacry et al. ( 2001) presented a different type 
of multifractal process, the so-called multifractal random 
walk (MRW). A popular discrete-time approximation to 
this process is given by equation ([3]), with 



M, = ce'''. 



(4) 



where hj is a stationary and centered Gaussian process 
with co-variances 



T 



t-s + 



l)Af 



(5) 



The constant c is chosen so that 1/c = E[e''']. If the 
step-length Af is fixecj^ it is convenient to denote R - 
T/At. The model then depends on three parameters: 

For the purpose of modeling financial time series, an 
important property of the MRW model is the slow decay 
of the volatility dependence. Since the innovations are 



"The variable t is dimensionless and represents the number of time 
steps of length At. 



The parameter A is called the intermittency parameter, and 
it also determines the nonlinearity of the scaling func- 
tion. In fact, the scaling function of the (continuous-time) 
MRW model is 



q 



In contrast to the MSM model, which is obtained by 
randomizing a discrete multiplicative cascade, the MRW 
model builds on a continuous cascade. In fact, the log- 
normal MRW model that we consider in this paper is just 
a special case of a more general class of processes known 



as infinitely divisible cascades (Muzy and Bacry 2002 1. 
These processes have very desirable theoretical proper- 
ties, e.g. exact multifractal scaling. From this point of 
view, the MRW model is preferable over the MSM model, 
and it is therefore important to develop inference tech- 
niques for the MRW model. A step in this direction was 
taken in (L0vsletten and Rypdal 201 1\ , where we pre- 
sented methods for ML estimation. These results were 
obtained by observing that the processes defined by equa- 
tions ([3]) and (j5]l are very similar to discrete-time versions 
of the basic SV models. In fact, if we replace the process 
hf with an auto-regressive model of order one (an AR(1) 
process), 

ht - ij/ht-i + cr,iU, , (8) 

where m, is Gaussian white noise with unit variance, then 
the process defined by equations (j3]l and Q is a basic S V 
model. Hence we can use existing techniques for basic 
SV models (Skau g and Yu\ |2009l [Martino et al] [2011] ) 
in combination with general ML methods for Gaussian 
processes (McLeod et al. 2007 | l to obtain likelihoods for 
the MRW model. 



While in (L0vsletten and Rypdal 201 1 1 we focused on 
parameter estimation, the focus of this paper is primarily 
conditional forecasts of returns and inference regarding 
the latent variables /z,. To be more precise, we are inter- 
ested in estimating the conditional variables hs\{x,, t < T]. 
For s <T this problem is known as smoothing, for s - T 
it is called filtering, and for i > T it is called forecast- 
ing. These techniques are of obvious importance for the 
applicability of the MRW model in finance. 
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The paper is structured as follows. In section|2]we re- 
view inference techniques for basic SV models, and in 
section [3] we generalize these results to the MRW model. 
In section [4] we apply some of these methods to data from 
the Oslo Stock Exchange, and in section |5] we give some 
concluding remarks. 

2. Inference techniques in the basic SV model 

In general, many statistical problems in stochastic mod- 
eling, e.g. model selection, parameter estimation and 
assessment of uncertainty in estimates, can be solved 
by utilizing the likelihood of the model. Given data 
z - {z\,...,Zt), the likelihood X of a random vector 
X- {x\,. . . , Xt), with probability density function Px{-\0), 
is defined as the function 



(9) 



i.e. one views the the probabiUty density as a function of 
the parameters 6, with z fixed. 

Remark 1. To simplify notations we will drop the sub- 
scripts on the densities throughout the rest of the paper It 
will be clear from the arguments which densities are con- 
sidered. We also suppres the dependency of the parameter 
vector in the notation of the densities. 

In the basic SV model the likelihoods are diflicult to 
compute directly. By conditioning on the latent field h - 
(hi, ... , hj), the probability density of x takes the form 



p(x) 



p(x\h)p(h)dh , 



(10) 



where the joint density p(x, h) 
of the Gaussian marginals 



p(x\h)p(h) is a product 



and 



p(x\h) = Y\ P(xi\hi) 



p(h)^p(hi)Y]p(hi\hi^i). 



(11) 



(12) 



1=2 



The factors in equations ( [TT] i and ( 12 1 are densities Cor- 
el 9 

responding to the distributions x,\ht ~ N(0,cr^cexp(h,)), 
h ~ N(Q,(tII{\ - ijf^)) and h,\h,^i ^ yV(#,_i,(r2). In 



general the integral in equation ( 10 1 has no closed form, 
and it is typically very demanding to compute numeri- 
cally. As an approximation one may consider a second- 
order Taylor expansion of log p{x, h) around the maxi- 
mum 

/!* = argmax,, log ^(x, /;). (13) 



The resulting integral is easily computed, giving the ex- 
pression 



pAx) ^ {Infl^ I det Q(jc)r p{x, h") , (14) 



where 



Q(jc) = 



log p{x, h) I 



l/i=/i* 



(15) 



is the Hessian matrix of the map h i-> log p{x, h), evalu- 
ated at h - h*. This approximation is known as Laplace's 
method, and it has been applied, by among others 'Martino 



et al. ( 201 1\ , to compute likelihood functions in basic SV 
models. The reason for its efficiency in basic SV models 
is the Markov structure of the latent field h. The Markov 
property ensures that the gradient of /j i-> log p{x, h) is on 
the form 



5 log p{x, h) 
dhi 



= + 2^Aijhj+, 



i(xi,hi). 



(16) 



where A - ||A,j|| is a tridiagonal matrix, are constants 
and gi are non-linear functions. By exploiting the sparse- 
ness of A, one can efficiently calculate h*. In addition, the 
Hessian matrix Q.{x) is tridiagonal, making the computa- 



tion of the expression in equation ( 14 1 efficient. 

We are now in a position to make statistical inference 
based on the basic SV model. We start by looking at fil- 
tering of the volatilities. Overlooking model uncertainty 
and parameter uncertainty, the conditional density pihrlx) 
contains all available information about the latent vari- 
able hr at time T. As a point estimate one may consider 
the Bayes estimator which is defined as the maximum of 
the posterior distribution pihrlx). This density is approx- 
imated using Laplace's method : 



pihrlx) oc p(x,hT) 

p(x, h)dh\ ■ ■ ■ dhj-i 
b p{x,h\, . . . hT-\,hT^ , 

where the factor b does not depend on hj, and 

{hi,... TiT-i) = argmaX/,_ J logp(x, h). 



(17) 



(18) 



Maximizing gives the filtered estimate hj of hj. The 



filtering procedure can be written more compact as in ( 1 3 1 
with hj - hj. 

For smoothing we consider the posterior distribution 
p(hs\x), now with s < T . A similar argument as for the 
derivation of the filtering formula gives the approximated 
Bayes estimator hs - h* where /;* is component s of the 



vector h* in equation ( 13 1 



3 



We note that if we are already calculating the likelihood 
using the approach described above, then very little addi- 
tional effort is required to obtain these estimates, since the 



maxima in equation ( 13 i is found as a part of the Laplace 
approximation. 

To forecast the volatility steps into the future we fol- 
low the same procedure as for smoothing and filtering. We 
need to find the maximum of the expression 

log p(x, h, Ht+n) = log p{x, h) + log p{hT+N\h) 

as a function of (h,hj-+]^i). Iterating equation ([8]l back- 
wards yields 



N-l 



hj+N = t/^hr + ^ i//''uT 



k=0 



and hence 



hT+N\h ~ N 



+N-k ■ 



N-l \ 
2k 



k=0 



Differentiation of log pihr+Nlh) gives 

d log p{hT+N\h) (hr+N - ^^hj) 



A'-l ,1,2k 



(19) 



To find a maximum we require that the expressions in 
equation ( [T9] l equals zero, and also that V/, log p(x, h) - 0. 
From this, the A^-step volatility forecast becomes 



hj+N = Vhj, 



(20) 



where hj is the filtered estimate of hj. The formulas we 
have derived for smoothing, filtering and forecasting of 
the volatilities are the same as in |Skaug and Yu ( 2009 ). 

To conclude this section we remark that, since 
P{xt^n\x) °^ p(x,xj+ff), the conditional densities 
P(xt+n\x) can be computed simply by using the Laplace 
approximation. For > 1 one must take into account 
that the matrices A and Q.{x, xt+n) are modified due to the 
inclusion of the density pihr+Nlh). 



3. Generalization to the MRW model 

In this section we extend the results of section |2] to the 
discrete-time MRW model. In this case h, is no longer 
a Markov process. While /i, still is a centered Gaussian 
process, its covariance structure is now given by equation 

Let us first review the approximation of the likeUhood 
for the MRW model ( |L0vsletten and Rypdall|201 l| l. One 



starts with Laplace's method, given in equation ( [T4) l. The 
density pix\h) is the same as for the basic SV model, but 
the density of p(h) needs to be handled differently. We 
denote by yik) - Cov(/jo, h/^) the auto-covariance function 
of the process /;,, and let F, be the variance-covariance 
matrices of the vectors (hi,. . . , h,). That is 

r,(/,j) = r(l'-;l) for /,; = o,i,...,f- i. 

As usual when working with Gaussian vectors, it is conve- 
nient to introduce regression coefficients 
<^*'' are defined via the equations 

r,0« = r,:,. 



The vectors 



(21) 



where yi:, = (y(\), . . . ,y{t)Y . From standard theory of 
multivariate normal distributions, the conditional distri- 
butions of h,\{hs : 1 < i < f) are normal. 



ht\{hr. \ <s<t]^ N{mt,P,), 



(22) 



with means m, - {h,-\,ht-2, ■ ■ ■ ,h\)<P^'^^^ and variances 
Pf - y{Q) - y\.,_J'^l{Y\:t-\- Since the density of h can be 
decomposed into a product of one-dimensional marginals. 



equation ( 22 1 gives p(h) 



for t 



Remark 2. Solving the the equations in (21 1 
1 , . . . , r - 1 can be done iteratively using the Durbin- 
Levinson algorithm, which requires only 0{T^) floating 



point operations. We refer to ( )McLeod et al. 2007| i for 
details. 

A second difference between the basic SV model and 
the MRW model is the structure of the matrices Q and 
A, which are defined by equations (15 1 and ( 16 1. For the 



MRW model these are no longer sparse. This makes the 



computation of the expression in equation ( 14 1 extremely 
demanding. The solution is to truncate the dependency in 
the process h, after a finite number of lags. This gives the 
approximation: 

pih,\{h, : 1 < 5 < f)) ~ p(h,\{h, : t - t < s < t}) , (23) 

where r € N is a truncation parameter We note that 
for t > T, the regression coefficients and variances of 
ht\{hs : t - T < s < t} are 0^'"^ and Pj+i respectively. After 
truncation, the matrices A and Q. become band-diagonal 
with bandwidths equal t. 

Remark 3. The likelihood approximation for the MRW 
model is implemented in the R computer language. In our 
implementation we have used analytical expressions for 
the first and second order derivatives to construct the ma- 
trices Q and A. The maxima h* are found by numerically 
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Figure 1: (a): Shows e''''-, where h, are the filtered estimates of h, from the daily log-returns in the OSEBX in the time period from February 20th 
2008 to February 8th 2012. The lower curve is for the MRW model, and the top curve is for the basic SV model. The top curve has been shifted 
to make it visible. The filtered signals are plotted together with the log-returns x, of the OSEBX. (b): Shows the same as in (a), but now h, are 
the smoothed estimates of h, given all the observations of x,. (c): Shows e''''^, where is the forecast performed using data l.r, : s < t ~ N] with 
N = 10 days, (d): Shows the same as in (c), but now with N = 50 days. 



calculating the roots of the expressions in equati on ^T6\ 
using the algorithm "DF-SANE" ( jLa Cruz et al.| |2006[ ). 
This algorithm is implemented in the R package "BB" 
( |Varadhan and Gilbert] [2009 | l. To find the determinant 
of the £1 we use the package "Matrix" which efliciently 
stores and manipulates sparse matrices. 

With the likelihood approximation at hand, we can ex- 
tend the formulas for smoothing, filtering and forecasting 
to the MRW model. As for the basic S V model, we maxi- 
mize the posterior distribution according to equation ( [T3| ), 
and the formulas for smoothing and filtering are exactly 
as for the basic SV model. 

To forecast the volatilities steps ahead we need the 
conditional density of hr+Nlh- Since this variable is nor- 
mal, the distribution is uniquely given by the mean mT+N\T 
and variance Pt+n\t, i-e. 

hj+Nlh i N(mT+N\T,PT+N\T)- (24) 

The mean is a linear combination the conditioning vari- 
ables, i.e. 

mT+N\T = (hT,hT-i,...,hi)cl>^'^-^ , 



where the coefficients (p'^''"' are solutions to the equations 

Tt'C'-^-"^ = 7N:T^N-l , (25) 

with 

7N:T^N-i = iriN), y{N+l),...,y(N+T- \)f . 
The variance is given by 

Pt+N\T = y(0) - yl/.T+N-l^T7N:T+N-\ ■ 

We note that in the special case = 1 we have (p^^''^^ = 
<p^^\ and we can again use the Durbin-Levinson algo- 
rithm. In the case > 1, the explicit inverse of Fj is 
needed, and one may use the algorithm of [Trench] ( | 1 964) , 
which utilizes that the matrices are Toeplitz. Using the 
same procedure as in section|2] we get the forecasting for- 
mula 

/V+^ = (/ir,/Vi,..-,/^i)<^*^'^\ (26) 

where (Jii,h2, ■ ■ ■ ,hTj are the smoothed estimates of 
(hi,h2,...,hT). 
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Figure 2: (a): Examples of forecasts hf+N for the basic SV model (dotted curve) and the MRW model (solid curve). The forecasts are computed 
from the OSEBX data with the time t corresponding to the date February 17th 2010. (b): Same as in (a), but now the forecasts are prefoiTned for 
the date March 5th 2009. 



Using the Laplace approximation for the MRW model, 
the A' step conditional densities p(xt+n\x) are computed 
as in section 12] 



4. Examples 

As an example we have applied the inference methods 
presented in sections[2]and[3]to a time series consisting of 
daily log-returns of the Oslo Stock Exchange Benchmark 
Index (OSEBX). The data used are closing prices for the 
time period May 25th 2001 to February 8th 2012, and the 
whole time series is used to obtain ML estimates for the 
basic SV model and the MRW model. The interesting 
estimates are = 0.98 for the basic SV model and A - 
0.33 for the MRW model. 

In figure [TJa) we have plotted the filtered estimates of 
h, together with the log-returns for the time period from 
February 20th 2008 to February 8th 2012. The filtering 
for the basic SV model and the MRW model are similar, 
but not identical. The same is seen in figure [TJb), which 
shows the smoothed estimates. In figures[TJc) and[T|d) we 
have plotted the A^-step forecast for = 10 days and - 
50 days respectively. In the 50-day forecast there are some 
clear visible differences between the two models. These 
differences become even clearer in figure |2] In this figure 
we show two examples, where we (for a fixed time f) make 
future predictions h,+N, and plot these as a functions of A^. 
It follows from equation ( [20| that these curves must be 
monotonic and exponentially decaying for the basic SV 
model. This is not the case for the MRW model, and we 
observe that the forecasts based on the this model have 
much richer behavior We note that similar observations 



have been made for the MSM model ( |Calvet and Fisher] 
[200T] l. 

As explained in sections |2] and [3] it is possible to use 
the Laplace approximation to compute the full conditional 
densities for future returns. This gives forecasts contain- 
ing more information than the estimates presented in fig- 
ures [T] and |2] In figure |3] we show two examples where 
such densities have been computed. In these examples, 
the volatility is high, and the conditional densities are 
wider than the unconditioned density. In other situations, 
where the volatility is low, the conditional densities will 
be narrower than the unconditioned density. 

Remark 4. The computer code that is used for these ex- 
amples is available online at ,coinplexityandplasmas ^ 
net 

5. Conclusion 

The main results of this paper are methods for smooth- 
ing, filtering and forecasting using the MRW model. In 
addition, we have presented methods for computing con- 
ditional densities of future returns. These results improve 
on existing forecasting techniques for multifractal mod- 
els, and we therefore consider this work to be an important 
contribution to the field. 

The methods presented in this work open the way for 
several future studies of multifractal modeling in finance. 
Among the new possibilities that we consider most inter- 
esting, are model comparisons based on estimated future 
distributions. 
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Figure 3: (a): An illustration of the use of conditional densities in forecasting. The figure shows the log-returns of the OSEBX up to December 12th 
2008 and the conditional density of the log-return for the next day. (b): Shows the same as (a), but now the conditional density is a N-step forecast 
with N = 50 days, (c): Shows the same conditional density (solid curve) as is illustrated in (a). This density is compared with the unconditional 
density for the log-returns (dashed fine), (d): Same as in (c), but now for the W-step forecast with N = 50 days 
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